ECLogger: Cross-Project Catch-Block Logging Prediction Using Ensemble of Classifiers

Authors

  • Sangeeta Lal
  • Neetu Sardana
  • Ashish Sureka
Abstract

Background: Software developers insert log statements in the source code to record program execution information. However, optimizing the number of log statements in the source code is challenging. Machine-learning-based within-project logging prediction tools, proposed in previous studies, may not be suitable for new or small software projects, because such projects do not have sufficient prior training data to construct the prediction model. For such software projects, we can use cross-project logging prediction. Aim: The aim of the study presented in this paper is to investigate cross-project logging prediction methods and techniques. Method: We propose ECLogger, a novel, ensemble-based, cross-project, catch-block logging prediction model. We use 9 base classifiers and combine them using 3 ensemble techniques to improve the cross-project logging prediction result. We evaluate the performance of ECLogger on three open-source Java projects: Tomcat, CloudStack and Hadoop. Results: ECLoggerBagging, ECLoggerAverageVote, and ECLoggerMajorityVote show a considerable improvement in the average LF in 3, 5, and 4 source→target project pairs, respectively, compared to the baseline classifiers. ECLoggerAverageVote performs the best and shows improvements of 3.12% (average LF) and 6.08% (average ACC) compared to the baseline classifier. Conclusion: Classifiers based on ensemble techniques such as bagging, average vote and majority vote outperform the baseline classifier. Overall, the ECLoggerAverageVote model performs the best. The results show that the CloudStack project is more generalizable than the other projects.
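The abstract names average vote and majority vote as two of the ensemble techniques but does not spell out how the votes are combined. The following is a minimal sketch of the two generic combination rules, not the authors' implementation. The function names, the 0.5 threshold, the tie-breaking rule, and the example probabilities are all assumptions made for illustration; label 1 stands for "catch block is logged".

```python
from collections import Counter
from typing import Sequence


def average_vote(probabilities: Sequence[float], threshold: float = 0.5) -> int:
    """Average the base classifiers' estimates of P(logged), then threshold.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    mean = sum(probabilities) / len(probabilities)
    return 1 if mean >= threshold else 0


def majority_vote(labels: Sequence[int]) -> int:
    """Return the label chosen by most base classifiers.

    Ties are broken in favour of 1 (an arbitrary, assumed rule).
    """
    counts = Counter(labels)
    return 1 if counts[1] >= counts[0] else 0


# Hypothetical outputs of three base classifiers for a single catch block.
probs = [0.9, 0.4, 0.45]
labels = [1 if p >= 0.5 else 0 for p in probs]  # hard labels: [1, 0, 0]

print(average_vote(probs))    # one confident classifier lifts the mean above 0.5
print(majority_vote(labels))  # two of the three hard labels are 0
```

On this made-up input the two rules disagree: averaging lets a single confident classifier outweigh two mildly negative ones, whereas majority voting counts each classifier equally.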


Similar resources

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...


Fault Detection of Bearings Using a Rule-based Classifier Ensemble and Genetic Algorithm

This paper proposes a reduct construction method based on discernibility matrix simplification. The method works with genetic algorithm. To identify potential problems and prevent complete failure of bearings, a new method based on rule-based classifier ensemble is presented. Genetic algorithm is used for feature reduction. The generated rules of the reducts are used to build the candidate base...


Predicting the need for CT imaging in children with minor head injury using an ensemble of Naive Bayes classifiers

OBJECTIVE Using an automatic data-driven approach, this paper develops a prediction model that achieves more balanced performance (in terms of sensitivity and specificity) than the Canadian Assessment of Tomography for Childhood Head Injury (CATCH) rule, when predicting the need for computed tomography (CT) imaging of children after a minor head injury. METHODS AND MATERIALS CT is widely cons...


A Preprocessing Technique to Investigate the Stability of Multi-Objective Heuristic Ensemble Classifiers

Background and Objectives: Owing to the random nature of heuristic algorithms, stability analysis of heuristic ensemble classifiers is of particular importance. Methods: The novelty of this paper is the use, for the first time, of a statistical method consisting of the Plackett-Burman design and Taguchi to specify not only the important parameters but also their optimal levels. Minitab and Design Expert ...


A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows

One of the most important issues concerning the sensor data in the Wireless Sensor Networks (WSNs) is the unexpected data which are acquired from the sensors. Today, there are numerous approaches for detecting anomalies in the WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...



Journal:
  • e-Informatica

Volume 11

Publication date: 2017